Amiga Developer CD 2.1

Text File | 1999-10-27 | 14.8 KB | 447 lines

(c) Copyright 1989-1999 Amiga, Inc. All rights reserved.
The information contained herein is subject to change without notice, and is provided "as is" without warranty of any kind, either expressed or implied. The entire risk as to the use of this information is assumed by the user.


Introduction to 1.3 IEEE Double Precision Libraries

by Dale Luck


The basic double precision IEEE library has been rewritten for V1.3. The new library is up to 4 times faster than the old one that came with V1.2. Several bugs were also fixed, and the routines now produce slightly more accurate results. I've listed some benchmarks comparing the two versions of the libraries at the end of this article.

Besides the faster software emulation of floating point, the new IEEE math library recognizes the 68020/68881 processor combination and will use the special floating point instructions available. Also, if an auto-configured math resource is available, it will use that as well. Typically, this resource would point to the base of a 68881 designed as a 16 bit IO port. But it could be another device as well.

With the new library, you also have the ability to programmatically trap math errors such as overflow and divide by zero. Your program can now ignore them or take suitable action without visiting the GURU.

In addition to a new version of the basic mathieeedoubbas.library, a second library supporting transcendental functions has been added. The name of the new library is mathieeedoubtrans.library, for IEEE double precision transcendental library. It supports the same functions as the transcendental library for the Motorola fast floating point, such as sine, cosine, square root, etc. This library can also identify and use the 68020/68881 combination or other math resources. And it has a very fast software square root routine.


When Should You Use These Libraries?
These libraries have been benchmarked as the fastest IEEE double precision libraries available on the Amiga, and they outperform almost all other software math libraries in the Amiga class personal workstation market. If you need the precision of IEEE double, and wish to have a transparent improvement in speed when your programs run on machines with math coprocessors, then you should use these libraries. All the decision making is done by the library when it is first initialized, and it will use the fastest available resources to do your math. You only need one program to support a standard Amiga, a 68020/68881 Amiga, or an external math coprocessor Amiga. It works automatically.


When Should You Avoid These Libraries?

If you don't need double precision, use the Motorola fast floating point routines. As you can see from the benchmarks, the Motorola routines are still quite a bit faster.

If you want your math to be the fastest possible, you will want to use the new instructions available on the 68020/68881 directly in your code. In that case, you would not need the IEEE libraries. However, this would prevent your code from running on conventional 68000 based Amigas unless you supply different versions of your code for each configuration.


Floating Point Formats

Here's a chart comparing the various methods of representing floating point numbers used by Amiga system software. The IEEE double precision libraries operate on 64 bit quantities. The Motorola FFP libraries use 32 bits. Note that there is a "hidden" bit in the fraction part of IEEE numbers. Since all numbers are normalized, the leading 1 is always present and is not stored; it is shown as "+1" in the chart.

                              Motorola    Single      Double
    Field Size (bits)         FFP         IEEE        IEEE

    Sign                      1           1           1
    Exponent                  7           8           11
    Fraction                  24          23+1        52+1
    Total                     32          32          64

    Minimum (+) number        5.4e-20     1.2e-38     2.2e-308
    Largest (+) number        9.2e+18     3.4e+38     1.8e+308
    Minimum (+) number        n/a         1.4e-45     4.9e-324
      (denormalized)

Denormalized means reduced in precision so that numbers closer to zero can be represented.
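As a rough illustration of the double precision column in the chart above, the following C sketch splits a 64 bit IEEE double into its sign, exponent and fraction fields. It is not part of the Amiga libraries. The names ieee_fields, split_double and is_denormalized are made up for this example, and the code assumes a host compiler that provides <stdint.h> and stores a double as a 64 bit IEEE value. The exponent bias of 1023 comes from the IEEE format itself, not from this article.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not an Amiga structure. */
struct ieee_fields {
    unsigned sign;       /* 1 bit                                   */
    unsigned exponent;   /* 11 bits, stored with a bias of 1023     */
    uint64_t fraction;   /* 52 stored bits, hidden bit not included */
};

/* Copy the bit pattern of a double and pull the three fields apart. */
static struct ieee_fields split_double(double d)
{
    uint64_t bits;
    struct ieee_fields f;

    memcpy(&bits, &d, sizeof bits);   /* assumes sizeof(double) == 8 */
    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.fraction = bits & 0x000FFFFFFFFFFFFFULL;
    return f;
}

/* Nonzero for a denormalized number: exponent field 0, fraction nonzero. */
static int is_denormalized(struct ieee_fields f)
{
    return f.exponent == 0 && f.fraction != 0;
}
```

For the value 1.0, split_double gives a sign of 0, an exponent field of 1023 and a fraction of 0: the whole value is carried by the hidden leading 1, which is never stored.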
Floating Point Representation

    +--------+--------+--------+--------+
    |ffffffff|ffffffff|ffffffff|Seeeeeee|    Motorola FFP
    +--------+--------+--------+--------+

    +--------+--------+--------+--------+
    |Seeeeeee|efffffff|ffffffff|ffffffff|    IEEE Single
    +--------+--------+--------+--------+

    IEEE Double
    +--------+--------+--------+--------+--------+--------+--------+--------+
    |Seeeeeee|eeeeffff|ffffffff|ffffffff|ffffffff|ffffffff|ffffffff|ffffffff|
    +--------+--------+--------+--------+--------+--------+--------+--------+

    S = Sign bit
    f = fraction bits
    e = exponent bits

The scheme used in IEEE floating point representation includes a few "special" numbers. Certain patterns of bits are used to represent exceptions:

    o NAN 'Not A Number' (result of 0/0)
    o INF 'Infinity' (result of 1/0)

There are other assigned patterns in addition to these two.


Using the Libraries

The new IEEE libraries should be placed in the :libs directory. Use the mathieeedoubbas.library to replace the old library of that same name. The mathieeedoubtrans.library is an all new addition. Code that calls routines in these libraries will have to be linked to the new .lib files, which also have awkward names. They are mathieeedoubbas_lib.lib and mathieeedoubtrans_lib.lib. There is also a new .fd file for the transcendental functions.

Using the IEEE routines is straightforward - they are a standard library. Simply open the library, use the routines and close the library when you are done.
For example, to use the Sine routine:

    /* IEEE Sine Routine                                        */
    /* Compile under Lattice 4.0 by linking with c.o +          */
    /* mathieeedoubbas_lib.lib + mathieeedoubtrans_lib.lib      */
    /* + lcm.lib + lc.lib + amiga.lib                           */

    double IEEEDPSin();

    extern int MathIeeeDoubBasBase;
    int MathIeeeDoubTransBase;

    void main()
    {
        double x=0;

        /* Open the basic library first; give up if it is missing */
        MathIeeeDoubBasBase=OpenLibrary("mathieeedoubbas.library",0);
        if(MathIeeeDoubBasBase==0) exit(0);

        /* Open the transcendental library; clean up on failure */
        MathIeeeDoubTransBase=OpenLibrary("mathieeedoubtrans.library",0);
        if(MathIeeeDoubTransBase==0)
        {
            CloseLibrary(MathIeeeDoubBasBase);
            exit(0);
        }

        x=IEEEDPSin( (double) 60 );    /* argument is in radians */
        printf("sin 60 = %e\n",x);

        CloseLibrary(MathIeeeDoubBasBase);
        CloseLibrary(MathIeeeDoubTransBase);
    }


Hardware Developer Information

To make use of CBM's standard peripheral support for the 68881, you must design your peripheral to autoconfig. Your autoconfig software must create a resource and add it to the resource list. The name of this resource is "MathIEEE.resource". The IEEE library will attempt to open this resource. If it finds it, it will extract the BaseAddr pointer and copy it into its library structure. If the BaseAddr pointer is non-null, it will use a different list of routine entry points when the IEEE library is initialized.

After the IEEE library is initialized, the library again checks the resource for alternate function bits in the Flags field of the resource. The Basic library only checks the DblBasAlt bit, and the transcendental library only checks the DblTransAlt bit. If they are set, the library routine will call the function whose address is in the corresponding Init field. The arguments passed are a6=sysbase, a1=resource and a2=mathlibrary. If your device is not a 68881, you may need to use this mechanism.

There are separate bits for different library capabilities in case your math resource is only able to handle a limited set of functions. This will let you tie in a math processor that may only provide addition, subtraction, multiplication and division functions.
The rest of the software will use it transparently by calling your alternate routines.

Amiga does not provide for arbitrating a math accelerator in a multitasking environment. Therefore, you must provide your own support for this when your device autoconfigs. The only exception is the 68020/68881 combination, for which support has been standard since V1.2.

Arbitration usually involves saving and restoring the state of your hardware device between task switches. We recommend that you look at the tc_Switch and tc_Launch vectors in the task data structure. These are called each time control transfers from one task to another. Remember not to assume that you are the only process needing to use those vectors.

The resource data structure is as follows:

        STRUCTURE MathIEEE,LN_SIZE
            UWORD   MathIEEE_Flags
            ULONG   MathIEEE_BaseAddr       ; for standard 68881 support
            ULONG   MathIEEE_DblBasInit     ; something else besides 68881
            ULONG   MathIEEE_DblTransInit   ; something else besides 68881
            ULONG   MathIEEE_SnglBasInit    ; something else besides 68881
            ULONG   MathIEEE_SnglTransInit  ; something else besides 68881
            LABEL   MathIEEE_sizeof

    *
    * Bits for MathIEEE_Flags.  All unassigned bits must be 0
    *
        BITDEF  MathIEEE,DblBasAlt,0        ; alternate Basic library
        BITDEF  MathIEEE,DblTransAlt,1      ; alternate Trans library
        BITDEF  MathIEEE,SnglBasAlt,2       ; alternate Basic library
        BITDEF  MathIEEE,SnglTransAlt,3     ; alternate Trans library

The MathIEEE resource structure may grow in the future. Extensions will be added as Amiga, Inc. adds new standards such as 80 bit extended format.

The 'Init' entries in the math resource structure are only used if the corresponding bit is set in the Flags field. So if you are just a 68881, you do not need the Init entries. Make sure you have cleared the Flags field. This should allow us to add Extended Precision later. For Init users, make sure you add yourself into the Open/Close/Expunge vectors for this library.

The library structure that is used is tentatively laid out as shown below.
I say tentatively because the names of the entries may still change. The order of the entries, their usage and their size will not change. Naturally, we may add new fields to the end.

        STRUCTURE MI,LIB_SIZE           ; Standard library node
            UBYTE   io8_Flags           ; is this 68881?
            UBYTE   io8_pad             ; line up to next 32bit boundary
            ULONG   io8_68881           ; ptr to io68881 base
            ULONG   io8_SysLib          ; ptr to SysBase
            ULONG   io8_SegList         ; ptr to this SegList
            ULONG   io8_Resource        ; ptr to mathIEEE.resource
            ULONG   io8_opentask        ; called when task opens
            ULONG   io8_closetask       ; called when task closes
            LABEL   MI_SIZE

Of particular interest to hardware developers are the opentask and closetask entry points. These functions will be called when a task calls OpenLibrary and CloseLibrary. This gives the vendor the opportunity to set up any per task initialization necessary. The Amiga library presently sets them up as NOPs in the case of straight emulation. It puts the 68881 initialization code in there for the 68020/68881 as well as the peripheral 68881. That initialization code currently sets up rounding modes and interrupt requests. If you need to override the defaults, you will have to set the appropriate Alt bits in the resource structure and overwrite the opentask/closetask fields when your AltInit function is called.

The OpenLibrary routine checks the return value of opentask for errors. If d0.l is nonzero, OpenLibrary will return 0 to the task trying to open the library.

On the 68020/68881, some new exceptions are generated. Unfortunately, the V1.2 operating system does not properly initialize these. For users of the new ramkick/A2024 system, the fixes have been added to the exec.library. For the rest, we provide a program to run during your startup sequence to initialize the vectors and redirect processing back to exec when the new exceptions occur. This is only necessary on 68020/68881 systems.


Benchmarks

This section contains some benchmarks comparing the performance of the various Amiga math libraries.
Use these as a guide when selecting the math routines for your application. All these benchmarks show the results when compiling under Green Hills C. The results you get with another compiler will vary.


How does V1.3 stack up to V1.2?

    A Comparison of Software

                                V1.2        V1.3        V1.2
                                IEEE        IEEE        MathFFP

    Float   10000 (secs)        92.14       45.22       17.64
           256000 (secs)        580.58      282.52      136.78
    Calcpi (kflops/sec)         2.07        4.93        11.14
       PI error                 -5.5e-14    -1.4e-11    6.1e-5
    Whetstone (kwhets/sec)      12          24          78
    Savage (secs)               N/A         470         98.2

    System tested: A1000, 512k chip memory, 1 external floppy


    Transparent Increase in Speed

                                V1.3/000    000/881     020/881

    Float   10000 (secs)        45.22       19.18       13.46
           256000 (secs)        282.52      179.98      122.46
    Calcpi (kflops/sec)         4.93        7.89        11.78
       PI error                 -1.39e-11   -2.78e-11   -2.78e-11
    Whetstone (kwhets/sec)      24          81          124
    Savage (secs)               470         20.4        15.2
       error                    -6.9e-7     -5.6e-7     -5.6e-7

    Systems tested:
        V1.3/000  was an A1000 with 512k.
        000/881   was an A1000 with 512k plus 2M and Microbotic's "881 Starmath".
        020/881   was an A2000 with CSA's 68020/68881, 2M memory and a 2090a.


Penultimate Speed Tests:

    Comparison of Speed Using Inline F instructions

                                V1.3/000    020/881

    Float   10000 (secs)        45.22       0.26*
           256000 (secs)        282.52      15.86
    Calcpi (kflops/sec)         4.93        81.3
    Whetstone (kwhets/sec)      24          459
    Savage (secs)               470         4.6

    Systems tested:
        V1.3/000  was an A1000 with 512k and 1 external floppy.
        020/881   was an A2000 with CSA's 68020/881, 2M memory and a 2090a.

    Note: Under this test, the 020/881 test code will not run on a standard 68000 based system.

    * The Green Hills compiler may have optimized this benchmark to nothing.


Penultimate Speed Tests, II:

    Inline Results With Fast 32-Bit Memory

                                                                Inline      Inline
                                020         020/881     030/882     020/881     030/882

    Float   10000 (secs)        25.6        6.08        5.16        0.24*       0.18*
           256000 (secs)        168.74      54.08       47.52       15.28       13.16
    Calcpi (kflops/sec)         8.44        25.29       28.8        90.09       114.42
    Whetstone (kwhets/sec)      39          263         291         673         889
    Savage (secs)               320.8       8.4         7.6         4.46        3.98

    Systems tested:
        020       was an A2000 with CSA's 020 board running at 14 MHz.
        020/881   was an A2000 with CSA's 020/881 board running at 14 MHz.
        030/882   was an A2000 with CSA's 030/882 board running at 14/16 MHz.

    * The Green Hills compiler may have optimized this benchmark to nothing.
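When comparing columns in the benchmark tables above, it can help to reduce pairs of figures to ratios. The short C sketch below shows that arithmetic. The helper names time_speedup and rate_speedup are invented for this illustration and are not part of any Amiga library; the only assumption is that results given in seconds improve as they shrink, while results given as rates (kflops, kwhets) improve as they grow.

```c
/* For results measured in seconds, smaller is better,
   so the speedup is the old time divided by the new time. */
static double time_speedup(double old_secs, double new_secs)
{
    return old_secs / new_secs;
}

/* For results measured as a rate (kflops, kwhets), larger is better,
   so the speedup is the new rate divided by the old rate. */
static double rate_speedup(double old_rate, double new_rate)
{
    return new_rate / old_rate;
}
```

Using figures from the tables, time_speedup(92.14, 45.22) is a little over 2 for the Float 10000 test going from the V1.2 to the V1.3 IEEE library, and time_speedup(470.0, 20.4) is about 23 for the Savage test when a 68881 is added to a 68000 system.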